Clean speech reconstruction from noisy mel-frequency cepstral coefficients using a sinusoidal model

نویسندگان

  • Xu Shao
  • Ben P. Milner
چکیده

This paper extends the technique of speech reconstruction from MFCCs by considering the effect of noisy speech. To reconstruct a clean speech signal from noise contaminated MFCCs an estimate of the clean mel-filterbank vector is required together with a robust estimate of the pitch. This work applies spectral subtraction to the mel-filterbank vector (derived from noisy MFCCs) to provide a clean speech spectral estimate. To obtain a reliable estimate of pitch a robust extraction technique is used. Spectrograms and informal listening tests reveal that a clean speech signal can be successfully reconstructed from the noisy MFCCs. Pitch errors are shown to manifest themselves as artificial sounding bursts in the reconstructed speech signal. Incorrect estimates of the spectral envelope introduce periods of noise into the reconstructed speech.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust algorithms for speech reconstruction on mobile devices

This thesis is concerned with reconstructing an intelligible time-domain speech signal from speech recognition features, such as Mel-frequency cepstral coefficients (MFCCs), in a distributed speech recognition(DSR) environment. The initial reconstruction methods in this thesis require, in addition to MFCC vectors, fundamental frequency and voicing information. In the later parts of the thesis t...

متن کامل

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

متن کامل

Throat Microphone for Speaker Recognition Using AANN

In this paper, we have analyzed the performance of speaker recognition system based on features extracted from the speech recorded using throat microphone in clean and noisy environment. In general, clean speech performs better for speaker recognition system. Speaker recognition in noisy environment, using transducer held at the throat results in a signal that is clean even in noisy. This speak...

متن کامل

On compensating the Mel-frequency cepstral coefficients for noisy speech recognition

This paper describes a novel noise-robust automatic speech recognition (ASR) front-end that employs a combination of Mel-filterbank output compensation and cumulative distribution mapping of cepstral coefficients with truncated Gaussian distribution. Recognition experiments on the Aurora II connected digits database reveal that the proposed front-end achieves an average digit recognition accura...

متن کامل

Feature Normalisation for Robust Speech Recognition

Speech recognition system performance degrades in noisy environments. If the acoustic models (HMMs) for speech are built using features of clean utterances, the features of a noisy test utterance would be acoustically mismatched with the trained model. This gives poor likelihood values and poor recognition accuracy. Model adaptation and feature normalisation are two broad areas that address thi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003